Objectives: The aim of this study was to establish a prediction model of
medication adherence in elderly patients with chronic diseases and to identify
the variables showing the highest classification accuracy for medication
adherence, using a support vector machine (SVM) and a conventional statistical
method, logistic regression (LR). Methods: We included 293 chronic disease
patients older than 65 years treated at one tertiary hospital. Medication
adherence was measured with Morisky's self-report scale. Data were collected
through face-to-face interviews. The mean age of the patients was 73.8 years.
Classification was performed with LR (SPSS ver. 20.0) and SVM (MATLAB ver. 7.12).
Results: Taking into account 16 variables as predictors, the classification
accuracy of LR and SVM was 71.1% and 97.3%, respectively. We listed the top nine
variables selected by SVM; the accuracy using a single variable, self-efficacy,
was 72.4%. The results suggest that self-efficacy is a key factor in medication
adherence among a Korean elderly population in both LR and SVM. Conclusions:
Medication non-adherence was strongly associated with self-efficacy. Modifiable
factors associated with medication non-adherence, such as depression, health
literacy, and medication knowledge, were also identified. Since SVM builds an
optimal classifier to minimize empirical classification errors in discriminating
between patient samples, it could achieve a higher accuracy with a smaller number
of variables than the number used in LR. Further applications of our approach to
complex diseases and treatments may open new possibilities for researchers in
these domains.
I. Introduction

As the elderly population grows, the prevalence of aging-related diseases and
drug expenditures have increased in Korea. The Korean population over age 65 was
11.0% of the total population in the year 2010, and is projected to be 14.3% by
2018 and 20.8% by 2026 [1].

Older adults who have various chronic diseases requiring polypharmacy often do
not take medications as prescribed by their health care providers, with reported
rates of non-compliance ranging from 40% to 50% [2]. Medication non-adherence
lowers the effectiveness of treatments and raises medical costs. Therefore,
non-adherence is an important issue in the management of patients with chronic
diseases.

Elderly patients' adherence to prescribed medications is a complex phenomenon
that depends on an interaction of socio-demographic, medication, and
psychological factors. In previous studies, factors attributed to non-adherence
included access to medicines, polypharmacy, multiple morbidity [3], complexity
of regimens [4], and poor communication between prescriber and patient. Numerous
studies have explored the potential predictors of adherence to medications
across a variety of conditions. However, few data are available regarding the
decline in adherence over time and its associated risk factors. Recently, some
studies have begun to explore more modifiable predictors of adherence, such as
depression, medication knowledge, health literacy, and self-efficacy [5,6].

Predictive models are used in a variety of medical domains for diagnostic and
prognostic tasks. An increasingly large number of data items are collected
routinely, and often automatically, in many areas of medicine. Extracting useful
information and knowledge from this wealth of data is a challenge for the fields
of machine learning and statistics [7].

The support vector machine (SVM) is a relatively new classification or
prediction method developed through collaboration between the statistical and
machine learning research communities.

Today, SVM has become as prominent in the machine learning field as neural
network algorithms were before it. Applications of SVM include character
recognition, voice recognition, face detection, document retrieval, image
recognition, medical diagnostics, mortality prediction, and analysis of
bioinformatics and genetics data, among many other areas.

The heuristic behind the SVM algorithm is quite different from that of the
commonly used logistic regression (LR) modeling for prediction. The LR model is
fitted by maximum likelihood, which is commonly computed with an iteratively
reweighted least squares algorithm. SVM, in contrast, mathematically transforms
the input variables and then finds a separating boundary, called the hyperplane,
that classifies the transformed inputs [8].
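
The contrast between the two approaches can be made concrete by looking at the per-sample loss each one penalizes. The following is a minimal sketch, assuming the usual textbook formulations: LR fitted by maximum likelihood minimizes the logistic (log) loss, and a soft-margin SVM minimizes the hinge loss. The function names and example margins are illustrative assumptions and are not taken from the study.

```python
import math


def logistic_loss(margin: float) -> float:
    """Log loss for a sample with signed margin y * f(x), where y is -1 or +1."""
    return math.log(1.0 + math.exp(-margin))


def hinge_loss(margin: float) -> float:
    """Hinge loss: exactly zero once the sample lies beyond the margin (y * f(x) >= 1)."""
    return max(0.0, 1.0 - margin)


if __name__ == "__main__":
    # Illustrative margins only: negative = misclassified, positive = correct.
    for m in (-1.0, 0.0, 1.0, 2.0):
        print(f"margin={m:+.1f}  logistic={logistic_loss(m):.3f}  hinge={hinge_loss(m):.3f}")
```

Samples that are classified correctly with a margin of at least 1 contribute nothing to the hinge loss, so only samples near the boundary shape the SVM solution, whereas every sample contributes to the LR fit.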

LR and SVM have been compared on classification problems in several papers; SVM
generally showed performance equal or superior to that of LR [7,9]. SVM is
especially suitable for the analysis of biomedical data sets that comprise a
small number of records and a large number of variables [10].

In previous studies, classification algorithms and pattern analysis in the broad
healthcare domain focused mainly on Bayesian and artificial neural networks. To
date, SVM has not been studied for the prediction of medication adherence in
elderly patients with chronic diseases in Korea or other countries. The current
study was the first attempt to investigate the use of an SVM-based
classification model for determining the predictors of medication adherence in
elderly patients with chronic diseases.

The purposes of this study were to identify the factors influencing medication
adherence and to compare the accuracy of LR- and SVM-based models in predicting
medication adherence in elderly patients with chronic diseases.

Healthcare Informatics Research

II. Methods 

1. Data Collection and Preparation 

This cross-sectional descriptive survey was undertaken at outpatient clinics at
a teaching hospital in Cheonan, Korea. We used sample data collected from
January to May 2011 from 293 patients over 65 years of age with chronic disease.
We included elderly patients with normal cognitive function who had been taking
a medication for longer than 6 months and had asthma, hypertension, diabetes,
chronic obstructive pulmonary disease, liver cirrhosis, stroke, or
cardiovascular disease. The study was approved by the ethics committee of the
hospital prior to the start of data collection. Written consent was obtained.
The questionnaire was verbally administered to consenting respondents who were
unable to self-complete the survey.

Sixteen variables were used: age, gender, job, monthly income, spouse,
educational level, activities of daily living (ADL), perceived health status,
duration after diagnosis, number of medication types, daily pill counts, side
effects of medication, self-efficacy, depression, health literacy, and
medication knowledge. The variable sets used in this study are shown in Table 1.

2. Measurements 

The questionnaire was designed to yield information about demographic
characteristics such as age, gender, educational level, spouse, and monthly
income. Depression was assessed with the short form of the Geriatric Depression
Scale (GDS), a validated 15-item, self-report depressive symptom scale designed
to detect the presence of current depression in older adults [11] (Cronbach's
α = 0.91). To assess patients' medication knowledge, we used five self-report
questions with a 5-point Likert scale [12] (Cronbach's α = 0.79). Health
literacy indicates an individual's ability to obtain and use health information
to make appropriate decisions for health and medical care. For health literacy,
patients were asked three previously described health literacy screening
questions, each with five possible response options [13] (Cronbach's α = 0.81).
Perceived health status refers to the patient's self-reported health status, and
ADL refers to daily self-care capabilities at home and in outdoor environments.
In this study, self-efficacy indicated a patient's belief in his/her ability to
succeed in adhering to the prescription medication. We used a previously
described medication self-efficacy scale [14] with 13 items and good internal
consistency reliability (Cronbach's α = 0.89).

Table 1. Description of variables

Variable                     Description
Spouse                       No (1), yes (2)
Education level              Illiterate (1), elementary school (2), above middle school (3)
Monthly income               Less than one million won (1), more than one million won (2)
Job                          No (1), yes (2)
Duration after diagnosis     Less than 5 years (1), 5-10 years (2), over 10 years (3)
Medication knowledge         How well the patients knew the names, purposes, recommended
                             doses, frequencies, and side effects of their medications
                             5 self-reported questions with a 5-point Likert scale (5-25)
No. of medication type       1-2 kinds (1), 3-4 kinds (2), 5 or more kinds (3)
Side effect of medication    No (0), yes (1)
Activities of daily living   Passive (1), active (2)
                             Daily self-care capabilities at home and in outdoor environments
Perceived health status      Bad (1), fair (2), good (3)
                             Self-reported health status
Depression                   Short form of the Geriatric Depression Scale
                             15 self-reported questions with yes (1)/no (0) scale (0-15)
Health literacy              Individual's ability to obtain and use health information to
                             make appropriate decisions for health and medical care
                             3 self-reported questions with 5-point scale (3-15)
Self-efficacy                Patient's belief in his/her ability to succeed in adhering to
                             the prescription medication
                             13 self-reported questions with 3-point scale (13-39)

Lastly, medication adherence was determined using a modified version of the
four-item, self-reported Morisky medication adherence scale [15]. Each item was
in a yes/no format, with a maximum possible score of four equating to very poor
adherence and a score of 0 or 1 typically considered good adherence. The Morisky
scale has been used across many chronic diseases as a self-reported measure of
adherence to medications and has demonstrated good reliability and predictive
validity [16]. Scores <2 were considered indicative of non-adherence to
medications.



3. Variable Selection 

Sixteen variables were used in the model building and analysis. The variables
were selected because they either had been shown to have an impact on medication
adherence in previous research [17] or were of potential clinical importance as
indicated by a panel of experts. The variables were gender, age, job,
educational level, side effects of medication, depression, health literacy,
monthly income, spouse, duration after diagnosis, medication knowledge, number
of medication types, daily pill counts, perceived health status, ADL, and
self-efficacy. Patients were asked about medication adherence using a
self-reported questionnaire. The dataset was divided into a group of 120
adherent patients and a group of 173 non-adherent patients.

4. Comparison Between Prediction Models 

The LR model building processes were carried out in SPSS ver. 20.0 (IBM, Armonk,
NY, USA). A p-value < 0.05 was considered to be significant for inclusion into
the model. We checked multicollinearity to verify the adequacy of the regression
model. The variance inflation factor (VIF) values were less than 10, indicating
the absence of problematic collinearity in the data. Moreover, the Durbin-Watson
test for residual analysis gave a value of 1.85, which is close to 2 and
indicates no substantial correlation between the error terms of the model; this
supports the assumption of independent residuals. The data under study were
therefore considered suitable for regression analysis.
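
As an illustration of the residual check described above, the Durbin-Watson statistic can be computed directly from a sequence of residuals: it is the sum of squared successive differences divided by the sum of squared residuals, and values near 2 indicate little first-order autocorrelation. This is a minimal sketch; the function name and the example residuals are hypothetical and are not study data.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of residuals (ranges from 0 to 4)."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    # Numerator: squared differences between each residual and the previous one.
    num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    # Denominator: sum of squared residuals.
    den = sum(e ** 2 for e in residuals)
    if den == 0:
        raise ValueError("residuals are all zero")
    return num / den


# Hypothetical residuals, for illustration only.
example_residuals = [0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2]
dw = durbin_watson(example_residuals)
print(f"Durbin-Watson statistic: {dw:.2f}")
```

Values well below 2 suggest positive autocorrelation and values well above 2 suggest negative autocorrelation.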

The SVM-based model building processes were carried out with MATLAB ver. 7.12
(MathWorks, Natick, MA, USA). We used SVM with a radial basis function (RBF)
kernel.

The discrimination of the LR and SVM models was compared. Five widely used
statistics were adopted to evaluate the performance of each model: sensitivity,
specificity, positive predictive value (PPV), negative predictive value (NPV),
and accuracy. To test the ability of each model to distinguish patients, the
area under the receiver operating characteristic curve was calculated.
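
The five statistics follow directly from the four cells of a confusion matrix. The sketch below shows the standard definitions. The function name is hypothetical, and the example counts are those reported for the SVM model in Table 5, read under the assumption that rows are actual classes, columns are predicted classes, and "positive" means adherent.

```python
def classification_metrics(tp, fn, fp, tn):
    """Return the five evaluation statistics as fractions between 0 and 1."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),   # true positives among actual positives
        "specificity": tn / (tn + fp),   # true negatives among actual negatives
        "ppv": tp / (tp + fp),           # true positives among predicted positives
        "npv": tn / (tn + fn),           # true negatives among predicted negatives
        "accuracy": (tp + tn) / total,   # all correct predictions
    }


# Counts as reported for the SVM model in Table 5 (assumed layout, see above).
svm = classification_metrics(tp=115, fn=5, fp=3, tn=170)
for name, value in svm.items():
    print(f"{name}: {value * 100:.1f}%")
```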

III. Results 

1. General Characteristics of Patients 

We used sample data of 293 patients with chronic disease (120 with good
adherence and 173 with poor adherence). The mean age of the patients was 73.8
years. Table 2 shows the socio-demographic and clinical characteristics of the
patients.

2. Development of the Logistic Regression Model 

Taking into account the 16 variables, the accuracy of the LR model was 71.1%.
Duration after diagnosis and self-efficacy were selected as significant
variables in the LR model. The odds of medication adherence for patients with a
duration after diagnosis of more than 10 years were about 54% lower than for
those with a duration of less than 5 years (Exp(B) = 0.461). For every one-unit
increase in the self-efficacy score, the odds of medication adherence rose by
27% (Exp(B) = 1.274). A complete list of the study variables with their p-values
is given in Table 3.
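
The relationship between a logistic regression coefficient B and its odds ratio Exp(B) can be checked with a few lines of code. This is a minimal sketch, assuming the usual Wald interval exp(B ± 1.96 × SE); the function names are hypothetical, and the coefficients are the values reported in Table 3.

```python
import math


def odds_ratio(b, se, z=1.96):
    """Return (Exp(B), lower, upper) with a Wald 95% confidence interval."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


def percent_change_in_odds(b):
    """Percentage change in the odds for a one-unit increase in the predictor."""
    return (math.exp(b) - 1.0) * 100.0


# Coefficients as reported in Table 3.
or_eff, lo_eff, hi_eff = odds_ratio(0.242, 0.040)    # self-efficacy
or_dur, lo_dur, hi_dur = odds_ratio(-0.774, 0.364)   # duration >10 years vs. <5 years

print(f"self-efficacy: OR={or_eff:.3f} ({lo_eff:.3f}-{hi_eff:.3f})")
print(f"duration >10 yr: OR={or_dur:.3f} ({lo_dur:.3f}-{hi_dur:.3f})")
```

An odds ratio below 1 corresponds to a reduction in the odds of (1 - Exp(B)) × 100 percent, not Exp(B) × 100 percent.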

Table 2. Socio-demographic and clinical characteristics of the 293 patients

Characteristic                        Value
Age (yr)                              74.4 ± 6.3
Gender
  Men                                 156 (53.2)
  Women                               137 (46.8)
Spouse
  No                                  101 (34.5)
  Yes                                 192 (65.5)
Educational level
  Illiterate                          69 (23.5)
  Elementary                          105 (35.8)
  Above middle school                 119 (40.6)
Monthly income (1,000 Korean won)
  <1,000                              223 (76.1)
  >1,000                              70 (23.9)
Job
  No                                  226 (77.1)
  Yes                                 67 (22.9)
Duration after diagnosis (yr)
  <5                                  72 (24.6)
  5-9                                 75 (25.6)
  >10                                 146 (49.8)
Medication knowledge                  16.2 ± 3.8
No. of medication type
  1-2                                 99 (33.8)
  3-4                                 133 (45.4)
  >5                                  61 (20.8)
Daily pill counts
  <5                                  91 (31.1)
  5-9                                 108 (36.9)
  >10                                 94 (32.1)
Side effects of medication
  No                                  262 (89.4)
  Yes                                 67 (22.9)
Activity daily living
  Passive                             35 (11.9)
  Active                              258 (88.1)
Perceived health status
  Bad                                 149 (50.9)
  Fair                                93 (31.7)
  Good                                51 (17.4)
Depression                            7.2 ± 2.5
Health literacy                       8.3 ± 1.9
Self-efficacy                         33.6 ± 5.0
Medication adherence
  Non-adherent                        173 (59.0)
  Adherent                            120 (41.0)

Values are presented as number (%) or mean ± standard deviation.

Table 3. Logistic regression model

                                                                             95% CI for Exp(B)
Variable                                  B        SE       p-value  Exp(B)  Lower    Upper
Age                                       -0.019   0.025    0.439    0.981   0.935    1.030
Gender (Men)                              -0.227   0.342    0.506    0.797   0.407    1.558
Spouse (No)                               -0.532   0.348    0.126    0.587   0.297    1.161
Educational level
  Illiterate                                                0.762
  Elementary                              0.260    0.405    0.521    1.297   0.587    2.867
  Above middle school                     0.076    0.453    0.868    1.078   0.444    2.622
Monthly income (<1 million Korean won)    -0.413   0.377    0.273    0.662   0.316    1.386
Job (No)                                  -0.236   0.383    0.537    0.790   0.373    1.671
Duration after diagnosis (yr)
  <5                                                        0.032
  5-9                                     0.019    0.404    0.963    1.019   0.461    2.252
  >10                                     -0.774   0.364    0.033    0.461   0.226    0.941
Medication knowledge                      0.057    0.047    0.217    1.059   0.967    1.160
No. of medication type
  1-2                                                       0.910
  3-4                                     -0.135   0.390    0.728    0.873   0.407    1.875
  >5                                      -0.005   0.509    0.992    0.995   0.367    2.695
Daily pill counts
  <5                                                        0.471
  5-9                                     0.483    0.403    0.231    1.620   0.735    3.570
  >10                                     0.473    0.481    0.325    1.605   0.625    4.119
Side effects of medication (No)           -0.071   0.473    0.881    0.932   0.369    2.355
Activity daily living (Passive)           -0.031   0.525    0.953    0.970   0.346    2.716
Perceived health status
  Bad                                                       0.158
  Fair                                    0.260    0.346    0.453    1.297   0.658    2.553
  Good                                    0.793    0.413    0.055    2.209   0.984    4.960
Depression                                -0.012   0.065    0.851    0.988   0.869    1.123
Health literacy                           0.003    0.079    0.968    1.003   0.859    1.171
Self-efficacy                             0.242    0.040    0.000    1.274   1.178    1.377
Constant                                  -7.756   2.746    0.005    0.000

SE: standard error, CI: confidence interval.

3. Development of the SVM

To examine the characteristics of the patient samples with good and poor
adherence before performing the SVM experiments, we applied principal component
analysis (PCA). PCA transforms the original features of a multivariate data set
into new features that are not correlated with each other [18]. The original
features representing a patient sample can therefore be reduced to a smaller
number of new features, referred to as principal components (PCs). The direction
of largest variance in the data set is taken as the first axis (the first PC) of
the new coordinate system, the direction of second largest variance as the
second axis (the second PC), and so on. Applying PCA to the 16 features of the
293 patient samples, we projected the patients onto a three-dimensional space
composed of PCs 1, 2, and 3. Figure 1 illustrates the distribution of the
patient samples in this coordinate system. From this figure, we expected to be
able to develop an SVM classifier that distinguishes patients with good
adherence from those with poor adherence.
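
The idea that the first PC is the direction of largest variance can be sketched in a few lines. The example below approximates only the first principal component by power iteration on the sample covariance matrix. It is a simplified illustration under stated assumptions, not the procedure used in the study: the function names and the toy data are hypothetical, further components would require deflation, and a library routine would normally be used in practice.

```python
import math


def first_principal_component(rows, iterations=200):
    """Approximate the first PC (a unit vector) by power iteration.

    Returns the component and the mean-centered data. Assumes the start
    vector is not orthogonal to the first PC.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0 / (j + 1) for j in range(d)]  # arbitrary non-uniform start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v, centered


def project(centered, component):
    """Coordinates of each centered sample along the given component."""
    return [sum(c[j] * component[j] for j in range(len(component))) for c in centered]


# Toy data lying exactly on the line y = 2x (illustration only).
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
pc, centered = first_principal_component(rows)
scores = project(centered, pc)
print("first PC:", [round(x, 4) for x in pc])
```

For this toy data all of the variance lies along a single direction, so one coordinate per sample is enough to describe it.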

To identify the variables that had the highest classification accuracy for
medication adherence in chronic disease, we developed an SVM with a radial basis
function kernel (parameters C = 1 and γ = 1/number of features) that
systematically searched through the space of subsets of variables and evaluated
the goodness of each variable subset according to its prediction accuracy. The
variable subset showing the highest accuracy was identified as the predictor
set. Parameter C weights the trade-off between empirical error and
generalization error. Parameter γ controls the width of the RBF kernel and
thereby the shape of the separating boundary.
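
The role of γ can be illustrated with the RBF kernel itself, K(x, z) = exp(-γ‖x - z‖²). This is a minimal sketch with a hypothetical function name and made-up sample vectors; the value of γ follows the setting stated in the text, evaluated here for all 16 variables.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel: similarity decays with the squared Euclidean distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


n_features = 16
gamma = 1.0 / n_features  # gamma = 1/number of features, as stated in the text

# Two made-up samples that differ by one unit in two features.
x = [0.0] * n_features
z = [1.0, 1.0] + [0.0] * (n_features - 2)
print(f"gamma={gamma}: K(x, z) = {rbf_kernel(x, z, gamma):.3f}")
print(f"gamma=1.0:    K(x, z) = {rbf_kernel(x, z, 1.0):.3f}")
```

A larger γ makes the similarity fall off faster with distance, which lets the decision boundary bend more tightly around individual samples.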

Although an exhaustive enumeration of variable subsets can find an optimal
solution, training and testing an SVM with every subset has an extremely high
computational cost. We therefore employed a sequential forward selection (SFS)
search [19]. SFS is a heuristic greedy search that starts from an empty set of
variables and sequentially adds one variable at a time, at each step choosing
the variable that most improves the prediction accuracy.
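
The greedy search can be sketched as follows. This is a generic illustration of SFS, not the authors' MATLAB implementation: the scoring function is supplied by the caller (in the study it would be the accuracy of an SVM trained on the candidate subset), and the toy weights below are hypothetical values chosen only to make the example runnable.

```python
def sequential_forward_selection(candidates, score, max_features=None):
    """Greedy SFS: repeatedly add the candidate that yields the best score.

    `score` maps a list of variable names to a number (higher is better).
    Returns the selection path as a list of (selected_variables, score).
    """
    remaining = list(candidates)
    selected, path = [], []
    limit = len(remaining) if max_features is None else max_features
    while remaining and len(selected) < limit:
        # Evaluate every remaining variable added to the current subset.
        best = max(remaining, key=lambda v: score(selected + [v]))
        selected = selected + [best]
        remaining.remove(best)
        path.append((list(selected), score(selected)))
    return path


# Toy scoring function with hypothetical weights (not study results).
TOY_WEIGHTS = {"self_efficacy": 0.5, "age": 0.3, "depression": 0.2, "job": -0.1}


def toy_score(variables):
    return sum(TOY_WEIGHTS[v] for v in variables)


path = sequential_forward_selection(TOY_WEIGHTS, toy_score)
best_subset, best_score = max(path, key=lambda step: step[1])
print("best subset:", best_subset)
```

Recording the score at every step gives an accuracy curve over the number of selected variables, and the subset at the peak of that curve is taken as the predictor set.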

We listed the top nine ranked variables selected by SVM, together with the
prediction accuracies obtained by combining the top-ranked variables, in Table 4
to examine the above results in detail. The accuracy using the single selected
variable was 72.4%; the variable selected was self-efficacy, as in the LR model.
The accuracy of the SVM reached 83.3% with two variables, self-efficacy and age.
The highest accuracy, 97.3%, was achieved with nine predictors: self-efficacy,
age, depression, health literacy, medication knowledge, number of medication
type, daily pill counts, duration after diagnosis, and education level.

Figure 2 shows the prediction accuracy for medication adherence as the 16
variables were added in the order selected by SVM. The performance decreased
markedly when more than 10 features were selected. Contrary to the intuition
that more variables should give higher predictive performance, this example
demonstrates that a small number of variables can achieve higher prediction
accuracy.



Figure 1. The Principal Component Analysis plot of 293 samples (stars represent
good adherence and circles represent poor adherence).

Figure 2. Classification accuracy for the 293 patients achieved with support
vector machine (SVM) according to the number of input variables.



Table 4. Combination of the top nine variables and classification accuracy

No. of variables   Accuracy (%)   Combined variables in ranking order
1                  72.4           Self-efficacy
2                  83.3           Self-efficacy, Age
3                  89.4           Self-efficacy, Age, Depression
4                  92.5           Self-efficacy, Age, Depression, Health literacy
5                  94.5           Self-efficacy, Age, Depression, Health literacy,
                                  Medication knowledge
6                  96.6           Self-efficacy, Age, Depression, Health literacy,
                                  Medication knowledge, Number of medication type
7                  96.6           Self-efficacy, Age, Depression, Health literacy,
                                  Medication knowledge, Number of medication type,
                                  Daily pill counts
8                  96.6           Self-efficacy, Age, Depression, Health literacy,
                                  Medication knowledge, Number of medication type,
                                  Daily pill counts, Duration after diagnosis
9                  97.3           Self-efficacy, Age, Depression, Health literacy,
                                  Medication knowledge, Number of medication type,
                                  Daily pill counts, Duration after diagnosis,
                                  Education level
Table 5. Comparison between LR and SVM

Model   Accuracy (%)   Sensitivity (%)   Specificity (%)   PPV (%)   NPV (%)
LR      71.1           62.7              76.9              64.9      75.1
SVM     97.3           95.8              98.3              97.5      97.1

Confusion matrix (rows: actual class; columns: predicted class)

Model   Actual     Predicted positive   Predicted negative
LR      Positive   74                   44
LR      Negative   40                   133
SVM     Positive   115                  5
SVM     Negative   3                    170

LR: logistic regression, SVM: support vector machine, PPV: positive predictive value, NPV: negative predictive value.



Figure 3. Area under the receiver operating characteristic curve (AUC) of the
logistic regression (LR) and support vector machine (SVM) models (x-axis: false
positive rate). LR AUC = 0.78; SVM AUC = 0.99542.



4. Comparison Between Prediction Models 

Table 5 compares the experimental results of LR and SVM using five evaluation
measures. SVM showed better performance than LR on all measures, allowing
identification of predictor candidates to determine the most probable medication
adherence of a patient.

LR showed 71.1% accuracy when all 16 variables were used for the prediction of
medication adherence (Tables 3 and 5). Compared to the result of LR, the result
of SVM showed significantly higher accuracy, 97.3%, with only nine variables on
the same patient samples (Table 5). This result indicates that SVM can achieve
greater accuracy with a smaller number of variables than the number used in LR.
It is interesting to note that the most significant variable (self-efficacy)
selected by the SVM agrees with that selected by LR. Even when a single variable
(self-efficacy) was used by SVM, 72.4% accuracy could be achieved, which is
higher than that achieved by LR using all variables.

The results of the comparison of the discriminatory power of the LR and SVM
models are summarized in Table 5.



The AUC indicates how well a prediction model discriminates between two classes,
for example between healthy patients and patients with disease. The following
guidelines have been proposed for interpretation of this area: 0.5-0.7, rather
low accuracy; 0.7-0.9, moderate accuracy useful for some purposes; and >0.9,
rather high accuracy [20]. By these guidelines, the LR model (AUC = 0.78) showed
moderate accuracy and the SVM model (AUC = 0.995) showed high accuracy.
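
The AUC has a simple probabilistic reading: it is the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative sample. The sketch below computes it by direct pairwise comparison and applies the interpretation guidelines quoted above. The function names and toy scores are hypothetical, and the handling of values that fall exactly on a boundary is an assumption.

```python
def roc_auc(labels, scores):
    """AUC by pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def interpret_auc(auc):
    """Map an AUC to the guideline categories (boundary handling is assumed)."""
    if auc > 0.9:
        return "high"
    if auc >= 0.7:
        return "moderate"
    if auc >= 0.5:
        return "low"
    return "worse than chance"


# Toy labels and scores, for illustration only.
toy_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
print(toy_auc, interpret_auc(toy_auc))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a data set of a few hundred patients.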

Our results indicate that the SVM model has better diagnostic capability than
the LR model, and its AUC indicates good diagnostic power. Figure 3 shows the
performance of the two models.

IV. Discussion 

Medication adherence is a complex phenomenon with many causes and correlates. In
this analysis, self-efficacy was revealed as the strongest predictor of
medication adherence. These findings suggest that providing clear instructions
and responding to questions may increase patients' confidence and knowledge
while enhancing their willingness to follow the treatment plan. Simple
counseling strategies described in the literature on health behavior change can
enhance provider skills in building confidence and motivating patients [12].
Also, in this study, the number of medication types, daily pill counts, and
duration after diagnosis were associated with medication adherence in the SVM
model. Those who took more medicines were more adherent. This finding
contradicts the common medical dictum that more medicines lead to poorer
adherence. However, other studies have reported similar results [21,22].
Shalansky and Levy [21] found that patients in long-term treatment for cardiac
disease had better adherence with more prescriptions. Thus, when multiple drugs
are clinically indicated, one can be cautiously optimistic about a patient's
ability to adhere to treatment, given appropriate instruction and support.

Other predictors such as age, health literacy, education level, and medication
knowledge were significantly related to medication adherence in elderly patients
with chronic disease. In particular, health literacy is the ability to obtain,
process, and understand health information to make appropriate health decisions
[23]. Studies in various patient populations demonstrate an association of
limited health literacy with poorer health-related knowledge, including
medication knowledge. Health literacy has also been associated with older age
and educational level [23,24]. Therefore, a broader understanding of these
relationships will facilitate the development of targeted interventions to
improve medication adherence, quality of care, and outcomes in patients with
chronic disease.

Also, we found that depression was associated with medi- 
cation non-adherence in elderly outpatients with chronic 
disease. Ziegelstein et al. [25] reported that depression was 
associated with taking less prescription medication, but it 
was unclear whether the frequency of "taking prescription 
medication" was measuring adherence or the number and 
frequency of medications prescribed. 

The current study has several limitations that should be addressed in future
studies of prediction modeling. First, in this study, self-reported adherence
was assessed once and not longitudinally. Further studies should explore the use
of multiple adherence assessment methods, such as the provider's report and the
Morisky medication adherence scale, which could be compared and aggregated to
obtain a single adherence estimate.

Second, the previous study of Son et al. [26] reported that the medication
knowledge variable was an important predictor of medication adherence in heart
failure patients. Self-efficacy and medication knowledge have important
implications for clinical care quality. Future studies need to examine how these
variables affect predictability by identifying the meaning and the scale level
of the variables that are important predictive factors. In-depth studies of
feature normalization, discretization, factor analysis, and detailed univariate
analysis will be needed.

Third, the same data were used for both training and testing in this study, so
the reported classification rate may be higher than the rate that would be
obtained on new data. Future studies should obtain more reliable estimates by
separating the test data from the training data in advance. With a sufficient
number of samples, validation schemes such as 10-fold cross-validation or
leave-one-out cross-validation (LOOCV) should also be considered.
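
The separation of training and test data suggested above can be sketched as a k-fold split of sample indices. This is a minimal illustration with a hypothetical function name; the sample count of 293 matches the study, while the seed and the fold assignment scheme are arbitrary choices.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(293, k=10))
for fold_no, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_no}: {len(train)} training samples, {len(test)} test samples")
```

Each sample is used for testing exactly once and never appears in the training set of its own fold; setting k equal to the number of samples gives leave-one-out cross-validation.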

To our knowledge, ours is the first study to use SVM models to examine the
association of medication non-adherence with socio-demographic factors,
medication factors, clinical factors, and psychosocial factors, including
depression, health literacy, medication knowledge, and self-efficacy, in
patients with chronic disease. Knowledge of these predictors will also inform
the development of interventions that target a higher-risk subset of older
patients with chronic disease.

Furthermore, SVM showed higher classification accuracy 
than LR, because it establishes the optimal classifier to maxi- 
mize the geometric margin between samples and therefore 
minimize empirical classification errors. We expect that 
SVM will serve as an effective alternative to conventional 
LR in identifying the key variables to show the highest clas- 
sification accuracy, thereby creating a valuable diagnostic 
program for medication adherence prediction. 

The research is not finished when a good model is found; 
the model must be included within some clinical informa- 
tion or decision support system. If possible, a cost or benefit 
study should also be done. 